The present invention relates to the embedding of robust identification codes in
electronic, optical and physical media, and the subsequent, objective discernment of such 
codes for identification purposes even after intervening distortion or corruption of the media. 

The invention is illustrated with reference to a few exemplary applications, including 
electronic imagery, emulsion film, and paper currency, but is not so limited. 



"I would never put it in the power of any printer or publisher to suppress or
alter a work of mine, by making him master of the copy"

Thomas Paine, Rights of Man, 1792.

"The printer dares not go beyond his licensed copy"



Since time immemorial, unauthorized use and outright piracy of audio and visual 
source material has caused lost revenues to the owners of such material, and has been a 
source of confusion and corruption of original work. 

With the advent of digitized audio signals and images, the technology of
copying materials and redistributing them in an unauthorized manner has reached new heights 
of sophistication, and more importantly, omnipresence. Lacking objective means for 
comparing an alleged copy of material with the original, owners and possible litigation 
proceedings are left with a subjective opinion of whether the alleged copy is stolen, or has 
been used in an unauthorized manner. Furthermore, there is no simple means of tracing a 
path to an original purchaser of the material, something which can be valuable in tracing
where a possible "leak" of the material first occurred. 

A variety of methods for protecting commercial material have been attempted. One 
is to scramble signals via an encoding method prior to distribution, and descramble prior to 
use. This technique, however, requires that both the original and later descrambled signals 
never leave closed and controlled networks, lest they be intercepted and recorded. 
Furthermore, this arrangement is of little use in the broad field of mass marketing audio and 
visual material, where even a few dollars extra cost causes a major reduction in market, and 
where the signal must eventually be descrambled to be perceived and thus can be easily 
recorded. 

Another class of techniques relies on modification of source audio or video signals to 
include a subliminal identification signal, which can be sensed by electronic means. Examples 
of such systems are found in U.S. Patent 4,972,471 and European patent publication EP 
441,702, as well as in Komatsu et al, "Authentication System Using Concealed Image in 
Telematics," Memoirs of the School of Science & Engineering, Waseda University, No. 52, 
p. 45-60 (1988) (Komatsu uses the term "digital watermark" for this technique). An 
elementary introduction to these methods is found in the article "Digital Signatures," Byte 
Magazine, November, 1993, p. 309. These techniques have the common characteristic that 
deterministic signals with well defined patterns and sequences within the source material 
convey the identification information. For certain applications this is not a drawback. But in 
general, this is a highly inefficient form of embedding identification information for a variety 
of reasons: (a) the whole of the source material is not used; (b) deterministic patterns have a 
higher likelihood of being discovered and removed by a would-be infringer; and (c) the 
signals are not generally 'holographic' in that identifications may be difficult to make given
only sections of the whole. ('Holographic' is used herein to refer to the property that the 
identification information is distributed globally throughout the coded signal, and can be fully 
discerned from an examination of even a fraction of the coded signal. Coding of this type is 
sometimes termed "distributed" herein.) 

What is needed is a reliable and efficient method for performing a positive 
identification between a copy of an original signal and the original. This method should not 
only be able to perform positive identification, it should also be able to relate version 
identification of sold copies in order to better pinpoint the point of sale. The method should
not compromise the innate quality of material which is being sold, as does the placement of 
localized logos on images. The method should be robust so that an identification can be made 
even after multiple copies have been made and/or compression and decompression of the 
signal has taken place. The identification method should be largely unerasable or
"uncrackable." The method should be capable of working even on fractional pieces of the
original signal, such as a 10 second "riff" of an audio signal or the "clipped and pasted"
sub-section of an original image. 

The existence of such a method would have profound consequences on audio and 
image piracy in that it could (a) cost effectively monitor for unauthorized uses of material and 
perform "quick checks"; (b) become a deterrent to unauthorized uses when the method is 
known to be in use and the consequences well publicized; and (c) provide unequivocal proof 
of identity, similar to fingerprint identification, in litigation, with potentially more reliability 
than that of fingerprinting. 

In accordance with an exemplary embodiment of the invention, a computer system is 
provided with associated means for manipulating either digital audio signals or digital images. 
In cases where original material is in "non-digital" form, such as on audio tape or on a 
photograph, means for creating a high fidelity digital copy of the material is included in the 
illustrative embodiment. This physical system will be referred to as the "Eye-D" workstation 
or system which serves as a concise trade name. The Eye-D system embeds an imperceptible 
global signal either directly onto the digital original or onto the "digitized copy" of the
original if it was in a non-digital form to begin with. The new copy with the embedded signal
becomes the material which is sold while the original is secured in a safe place. The new
copy will be nearly identical to the original except under the finest of scrutiny; thus, its
commercial value will not be compromised. After the new copy has been sold and distributed 
and potentially distorted by multiple copies, the present disclosure details a method for 
positively identifying any suspect signal against the original. 

The use of identification signals which are global (holographic), and which mimic
natural noise sources, are two important inter-related features which distinguish the
present invention from the collective prior art. This approach allows the maximization of
identification signal energy as opposed to merely having it present 'somewhere in the original
material.' This allows it to be much more robust in the face of thousands of real world
degradation processes and material transformations such as cutting and cropping of imagery.

40333.pa 3/17/94

The foregoing and additional features and advantages of the present invention will be 
more readily apparent from the following detailed description thereof, which proceeds with 
reference to the accompanying drawings. 

Brief Description of the Drawings 

Fig. 1 is a simple and classic depiction of a one dimensional digital signal which is 
discretized in both axes. 

Fig. 2 is a general overview, with detailed description of steps, of the process of 
embedding an "imperceptible" identification signal onto another signal. 

Fig. 3 is a step-wise description of how a suspected copy of an original is identified, 
provided that original and its copies are using the Eye-D identification system methodology. 

Fig. 4 is a schematic view of an apparatus for pre-exposing film with identification 
information in accordance with another embodiment of the present invention. 

Detailed Description 

In the following discussion of an illustrative embodiment, the words "signal" and
"image" are used interchangeably to refer to one, two, and even more than two dimensions
of digital signal. Examples will routinely switch back and forth between a one dimensional 
audio-type digital signal and a two dimensional image-type digital signal. 

In order to fully describe the details of an illustrative embodiment of the invention, it 
is necessary first to describe the basic properties of a digital signal. Fig. 1 shows a classic 
representation of a one dimensional digital signal. The x-axis defines the index numbers of a
sequence of digital "samples," and the y-axis is the instantaneous value of the signal at that 
sample, being constrained to exist only at a finite number of levels defined as the "binary 
depth" of a digital sample. The example depicted in Fig. 1 has the value of 2 to the fourth 
power, or "4 bits," giving 16 allowed states of the sample value. 

For audio information such as sound waves, it is commonly accepted that the 
digitization process discretizes a continuous phenomenon both in the time domain and in the
signal level domain. As such, the process of digitization itself introduces a fundamental error
source in that it cannot record detail smaller than the discretization interval in either domain.
The industry has referred to this, among other ways, as "aliasing" in the time domain, and 
"quantization noise" in the signal level domain. Thus, there will always be a basic error floor 
of a digital signal. Pure quantization noise, measured in a root mean square sense, is 
theoretically known to have the value of one over the square root of twelve, or about 0.29 
DN, where DN stands for 'Digital Number' or the finest unit increment of the signal level.
For example, a perfect 12-bit digitizer will have 4096 allowed DN with an innate root mean
square noise floor of ~0.29 DN.
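
The 0.29 DN figure can be checked numerically. The following is a minimal Python sketch, not part of the disclosed system: it rounds uniformly distributed continuous levels to the nearest DN and compares the resulting rms error with one over the square root of twelve. The function name, sample count, and seed are illustrative assumptions.

```python
import math
import random

def quantization_rms(num_samples=100_000, seed=0):
    """Estimate the rms error from rounding a continuous level to the nearest DN."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        level = rng.uniform(0.0, 4095.0)   # continuous level within a 12-bit range
        error = level - round(level)       # error introduced by quantization
        total += error * error
    return math.sqrt(total / num_samples)

estimate = quantization_rms()
theory = 1.0 / math.sqrt(12.0)
print(f"simulated {estimate:.3f} DN, theory {theory:.3f} DN")
```

The simulated value should agree with the theoretical one to within sampling error; it models quantization only and ignores any analog noise.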

All known physical measurement processes add additional noise to the transformation 
of a continuous signal into the digital form. The quantization noise typically adds in 
quadrature (square root of the mean squares) to the "analog noise" of the measurement 
process, as it is sometimes referred to. 

With almost all commercial and technical processes, the decibel scale is
used as a measure of signal and noise in a given recording medium. The expression
"signal-to-noise ratio" is generally used, as it will be in this disclosure. As an example, this
disclosure refers to signal to noise ratios in terms of signal power and noise power, thus 20
dB represents a 10 times increase in signal amplitude.
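
The decibel convention and the quadrature rule described above can be illustrated with a few helper functions. This is a sketch only; the function names are hypothetical, and the 2 DN analog noise in the last line is an assumed example value.

```python
import math

def amplitude_ratio_to_db(ratio):
    """Decibels for a ratio of amplitudes (rms values)."""
    return 20.0 * math.log10(ratio)

def power_ratio_to_db(ratio):
    """Decibels for a ratio of powers (amplitude squared)."""
    return 10.0 * math.log10(ratio)

def add_in_quadrature(*rms_values):
    """Combine independent noise sources: square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in rms_values))

print(amplitude_ratio_to_db(10.0))     # a 10 times increase in amplitude
print(power_ratio_to_db(100.0))        # the same change expressed as 100 times the power
print(add_in_quadrature(0.29, 2.0))    # quantization noise plus an assumed 2 DN analog noise
```

Both of the first two calls describe the same 20 dB change, which is why the text can speak of power while quoting an amplitude factor of ten.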

In summary, the presently preferred embodiment of the invention embeds an N-bit 
value onto an entire signal through the addition of a very low amplitude encodation signal 
which has the look and characteristics of pure noise. N is usually at least 8 and is capped on 
the higher end by ultimate signal-to-noise considerations and "bit error" in retrieving and 
decoding the N-bit value. As a practical matter, N is chosen based on application specific 
considerations, such as the number of unique different "signatures" that are desired. To 
illustrate, if N=128, then the number of unique digital signatures is in excess of 10^38
(2^128). This number is believed to be more than adequate to both identify the material
with sufficient statistical certainty and to index exact sale and distribution information. 

The amplitude or power of this added signal is determined by the aesthetic and 
informational considerations of each and every application using the Eye-D method. For 
instance, non-professional video can stand to have a higher embedded signal level without 
becoming noticeable to the average human eye, while very high precision audio may only be 
able to accept a relatively small signal level lest the human ear perceive an objectionable 
increase in "hiss." These statements are generalities and each application has its own set of
criteria in choosing the signal level of the embedded identification signal. The higher the
level of embedded signal, the more corrupted a copy can be and still be identified. On the
other hand, the higher the level of embedded signal, the more objectionable the perceived
noise might be, potentially impacting the value of the distributed material.

A definition of terms is now in order: 

The original signal refers to either the original digital signal or the high quality
digitized copy of a non-digital original. 

The N-bit identification word refers to a unique identification binary value, typically
having N range anywhere from 8 to 128, which is the identification code ultimately placed
onto the original signal via the disclosed transformation process. In the preferred
embodiment, each N-bit identification word begins with the sequence of values '0101,' which
is used to determine an optimization of the signal-to-noise ratio in the identification procedure
of a suspect signal (see definition below).

The m'th bit value of the N-bit identification word is either a zero or one
corresponding to the value of the m'th place, reading left to right, of the N-bit word. E.g.,
the first (m=1) bit value of the N=8 identification word 01110100 is the value '0;' the
second bit value of this identification word is '1,' etc.

The m'th individual embedded code signal refers to a signal which has dimensions
and extent precisely equal to the original signal (e.g. both are a 512 by 512 digital image),
and which is (in the illustrated embodiment) an independent pseudo-random sequence of
digital values. "Pseudo" pays homage to the difficulty in philosophically defining pure
randomness, and also indicates that there are various acceptable ways of generating the
"random" signal. There will be exactly N individual embedded code signals associated with
any given original signal.

The acceptable perceived noise level refers to an application-specific determination of 
how much "extra noise," i.e. amplitude of the composite embedded code signal described 
next, can be added to the original signal and still have an acceptable signal to sell or 
otherwise distribute. This disclosure uses a 1 dB increase in noise as a typical value which
might be acceptable, but this is quite arbitrary.

The composite embedded code signal refers to the signal which has dimensions and
extent precisely equal to the original signal, (e.g. both are a 512 by 512 digital image), and 
which contains the addition and appropriate attenuation of the N individual embedded code 
signals. The individual embedded signals are generated on an arbitrary scale, whereas the 
amplitude of the composite signal must not exceed the pre-set acceptable perceived noise 
level, hence the need for "attenuation" of the N added individual code signals. 

The distributable signal refers to the nearly similar copy of the original signal,
consisting of the original signal plus the composite embedded code signal. This is the signal 
which is distributed to the outside community, having only slightly higher but acceptable 
"noise properties" than the original. 

A suspect signal refers to a signal which has the general appearance of the original
and distributed signal and whose potential identification match to the original is being 
questioned. The suspect signal is then applied to the decoding process of Eye-D to see if it 
matches the N-bit identification word. 

In the detailed methodology of the preferred embodiment, the
N-bit identification word is encoded onto the original signal by having each of the N bit
values multiply its corresponding individual embedded code signal, the resultant being
accumulated in the composite signal, the fully summed composite signal then being
attenuated down to the acceptable perceived noise amplitude, and the resultant composite
signal added to the original to become the distributable signal.
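
The encoding steps just described can be sketched for a one dimensional signal as follows. This is a minimal illustration under stated assumptions, not the Eye-D implementation: the code signals are Gaussian pseudo-random sequences from Python's standard generator, the seed and the function names are hypothetical, and the "acceptable perceived noise level" is expressed simply as a target rms value.

```python
import math
import random

def rms(values):
    """Root mean square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def make_code_signals(num_bits, length, seed=1234):
    """One independent pseudo-random code signal per bit (seed is an assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(num_bits)]

def embed(original, bits, code_signals, noise_rms):
    """Return the original plus the attenuated composite embedded code signal."""
    composite = [0.0] * len(original)
    for bit, code in zip(bits, code_signals):
        if bit:                                   # a '0' bit multiplies its code signal to nothing
            for i, c in enumerate(code):
                composite[i] += c
    level = rms(composite)
    scale = noise_rms / level if level > 0 else 0.0   # attenuate to the allowed rms level
    return [o + scale * c for o, c in zip(original, composite)]

bits = [0, 1, 0, 1, 1, 0, 0, 1]                   # begins with the 0101 calibration sequence
original = [100.0 + 50.0 * math.sin(i / 20.0) for i in range(2000)]
codes = make_code_signals(len(bits), len(original))
distributable = embed(original, bits, codes, noise_rms=1.0)
added = [d - o for d, o in zip(distributable, original)]
print(f"rms of embedded signal: {rms(added):.3f}")
```

The original, the bit list, and the code signals are the three items that would be stored away for later identification.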

The original signal, the N-bit identification word, and all N individual embedded 
code signals are then stored away in a secured place. A suspect signal is then found. This
signal may have undergone multiple copies, compressions and decompressions, resamplings
onto different spaced digital signals, transfers from digital to analog back to digital media, or
any combination of these items. If the signal still appears similar to the original, i.e. its
innate quality is not thoroughly destroyed by all of these transformations and noise additions, 
then depending on the signal to noise properties of the embedded signal, the identification 
process should function to some objective degree of statistical confidence. The extent of 
corruption of the suspect signal and the original acceptable perceived noise level are two key 
parameters in determining an expected confidence level of identification.

The identification process on the suspected signal begins by resampling and aligning
the suspected signal onto the digital format and extent of the original signal. Thus, if an 
image has been reduced by a factor of two, it needs to be digitally enlarged by that same 
factor. Likewise, if a piece of music has been "cut out," but may still have the same 
sampling rate as the original, it is necessary to register this cut-out piece to the original, 
typically done by performing a local digital cross-correlation of the two signals (a common 
digital operation), finding at what delay value the correlation peaks, then using this found 
delay value to register the cut piece to a segment of the original. 
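
Registration of a cut-out piece by cross-correlation can be sketched as below. This is an illustrative sketch only: it uses a normalized correlation coefficient and an exhaustive search over every delay, both of which are assumptions made for simplicity, and the test signal is synthetic.

```python
import math
import random

def normalized_correlation(a, b):
    """Correlation coefficient of two equal-length sequences (0.0 if either is flat)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def find_delay(original, piece):
    """Delay into `original` at which `piece` correlates most strongly."""
    delays = range(len(original) - len(piece) + 1)
    return max(delays,
               key=lambda k: normalized_correlation(original[k:k + len(piece)], piece))

rng = random.Random(3)
original = [rng.gauss(0.0, 1.0) for _ in range(600)]
piece = [v + rng.gauss(0.0, 0.1) for v in original[137:237]]   # cut-out with mild degradation
delay = find_delay(original, piece)
print(f"piece registers at sample {delay}")
```

The delay at which the correlation peaks is then used to align the piece with the matching segment of the original.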

Once the suspect signal has been sample-spacing matched and registered to the 
original, the signal levels of the suspect signal must be matched in an rms sense to the signal 
level of the original. This can be done via a search on the parameters of offset, amplification,
and gamma, optimized by minimizing the mean squared error between the
two signals as a function of the three parameters. We can call the suspect signal normalized
and registered at this point, or just normalized for convenience. 
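
One possible form of the level-matching search is sketched below. The text does not specify the model or the search method, so the following are assumptions: the suspect levels are mapped as offset + gain * suspect^gamma, gamma is scanned over a coarse grid, and for each gamma the best offset and gain are obtained by ordinary least squares rather than by an iterative search.

```python
def fit_levels(suspect, original, gammas=None):
    """Return (mse, offset, gain, gamma) minimizing the mean squared error of
    offset + gain * suspect**gamma against original."""
    if gammas is None:
        gammas = [0.5 + 0.05 * k for k in range(21)]        # 0.5 through 1.5
    n = len(suspect)
    mean_y = sum(original) / n
    best = None
    for gamma in gammas:
        x = [max(s, 0.0) ** gamma for s in suspect]          # clamp to avoid negative bases
        mean_x = sum(x) / n
        var_x = sum((xi - mean_x) ** 2 for xi in x)
        if var_x == 0.0:
            continue
        cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, original))
        gain = cov / var_x                                   # least-squares slope
        offset = mean_y - gain * mean_x                      # least-squares intercept
        mse = sum((offset + gain * xi - yi) ** 2 for xi, yi in zip(x, original)) / n
        if best is None or mse < best[0]:
            best = (mse, offset, gain, gamma)
    return best

# Synthetic check: distort a ramp with known parameters, then recover them.
original = [10.0 + 0.9 * i for i in range(100)]
suspect = [((o - 5.0) / 1.2) ** (1.0 / 0.9) for o in original]
mse, offset, gain, gamma = fit_levels(suspect, original)
print(f"offset {offset:.2f}, gain {gain:.2f}, gamma {gamma:.2f}")
```

Applying the fitted offset, gain, and gamma to the suspect signal yields the normalized signal used in the subtraction step.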

The newly matched pair then has the original signal subtracted from the normalized 
suspect signal to produce a difference signal. The difference signal is then cross-correlated 
with each of the N individual embedded code signals and the peak cross-correlation value 
recorded. The first four bit code ('0101') is used as a calibrator both on the mean values of 
the zero value and the one value, and on further registration of the two signals if a finer 
signal to noise ratio is desired (i.e., the optimal separation of the 0101 signal will indicate an 
optimal registration of the two signals and will also indicate the probable existence of the 
N-bit identification signal being present.) 

The resulting peak cross-correlation values will form a noisy series of floating point 
numbers which can be transformed into 0's and 1's by their proximity to the mean values of 0
and 1 found by the 0101 calibration sequence. If the suspect signal has indeed been derived
from the original, the identification number resulting from the above process will match the 
N-bit identification word of the original, bearing in mind either predicted or unknown "bit 
error" statistics. Signal-to-noise considerations will determine if there will be some kind of 
"bit error" in the identification process, leading to a form of X% probability of identification 
where X might be desired to be 99.9% or whatever. If the suspect copy is indeed not a copy 
of the original, an essentially random sequence of 0's and 1's will be produced, as well as an
apparent lack of separation of the resultant values. This is to say, if the resultant values are
plotted on a histogram, the existence of the N-bit identification signal will exhibit strong 
bi-level characteristics, whereas the non-existence of the code, or the existence of a different 
code of a different original, will exhibit a type of random gaussian-like distribution. This 
histogram separation alone should be sufficient for an identification, but it is even stronger 
proof of identification when an exact binary sequence can be objectively reproduced. 
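
The identification procedure can be sketched end to end on a toy one dimensional signal. This is a simplified illustration under stated assumptions: the suspect signal is taken to be already registered and normalized, only the zero-lag correlation is used in place of a peak search, and the signal length, noise levels, seed, and names are all hypothetical.

```python
import random

def correlate(a, b):
    """Zero-lag cross-correlation of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def decode(suspect, original, code_signals):
    """Recover the bit values from a normalized, registered suspect signal."""
    difference = [s - o for s, o in zip(suspect, original)]
    scores = [correlate(difference, code) for code in code_signals]
    zero_level = (scores[0] + scores[2]) / 2.0    # calibration bits 1 and 3 are '0'
    one_level = (scores[1] + scores[3]) / 2.0     # calibration bits 2 and 4 are '1'
    bits = [0 if abs(s - zero_level) <= abs(s - one_level) else 1 for s in scores]
    return bits, scores

# Toy round trip: embed a word, degrade the copy, then decode it.
rng = random.Random(7)
length = 4096
word = [0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]      # begins with the 0101 calibration sequence
codes = [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in word]
original = [1000.0 + 200.0 * rng.random() for _ in range(length)]
composite = [sum(c[i] for b, c in zip(word, codes) if b) for i in range(length)]
distributed = [o + 0.5 * c for o, c in zip(original, composite)]
suspect = [d + rng.gauss(0.0, 2.0) for d in distributed]     # simulated copy degradation
recovered, scores = decode(suspect, original, codes)
print("match" if recovered == word else "no match")
```

Plotting `scores` as a histogram would show the two clusters described above when the code is present, and a single spread of values when it is not.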

Specific Example 

Imagine that we have taken a valuable picture of two heads of state at a cocktail
party, a picture which is sure to earn some reasonable fee in the commercial market. We
desire to sell this picture and ensure that it is not used in an unauthorized or uncompensated 
manner. This and the following steps are summarized in Fig. 2. 

Assume the picture is transformed into a positive color print. We first scan this into 
a digitized form via a normal high quality black and white scanner with a typical photometric 
spectral response curve. (It is possible to get better ultimate signal to noise ratios by scanning 
in each of the three primary colors of the color image, but this nuance is not central to 
describing the core process.) 

Let us assume that the scanned image now becomes a 4000 by 4000 pixel 
monochrome digital image with a grey scale accuracy defined by 12-bit grey values or 4096 
allowed levels. We will call this the "original digital image" realizing that this is the same as 
our "original signal" in the above definitions. 

During the scanning process we have arbitrarily set absolute black to correspond to 
digital value '30'. We estimate that there is a basic 2 Digital Number root mean square noise 
existing on the original digital image, plus a theoretical noise (known in the industry as "shot 
noise") of the square root of the brightness value of any given pixel. In formula, we have: 

<RMS Noise_{n,m}> = sqrt(4 + (V_{n,m} - 30))          (1)

Here, n and m are simple indexing values on rows and columns of the image ranging from 0 
to 3999. Sqrt is the square root. V is the DN of a given indexed pixel on the original digital 
image. The < > brackets around the RMS noise merely indicates that this is an expected 
average value, where it is clear that each and every pixel will have a random error
individually. Thus, for a pixel having a brightness of 1200 digital numbers above the black
level (V - 30 = 1200), we find that its expected rms noise value is sqrt(1204) = 34.70, which
is quite close to 34.64, the square root of 1200.

We furthermore realize that the square root of the innate brightness value of a pixel 
is not precisely what the eye perceives as a minimum objectionable noise, thus we come up 
with the formula: 

<RMS Addable Noise_{n,m}> = X*sqrt(4 + (V_{n,m} - 30)^Y)          (2)

Where X and Y have been added as empirical parameters which we will adjust, and "addable" 
noise refers to our acceptable perceived noise level from the definitions above. We now 
intend to experiment with what exact value of X and Y we can choose, but we will do so at 
the same time that we are performing the next steps in the Eye-D process. 
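
Equations (1) and (2) translate directly into two small functions. This is a sketch for checking the arithmetic; the constant and function names are hypothetical, and clamping pixel values below the black level to zero is an added assumption so that the fractional power is always defined.

```python
import math

BLACK_LEVEL = 30       # digital value assigned to absolute black during scanning
BASIC_NOISE_SQ = 4     # square of the basic 2 DN rms noise

def expected_rms_noise(v):
    """Equation (1): basic noise plus shot noise for a pixel of value v (in DN)."""
    return math.sqrt(BASIC_NOISE_SQ + max(v - BLACK_LEVEL, 0))

def addable_rms_noise(v, x, y):
    """Equation (2): noise allowance scaled by the empirical parameters X and Y."""
    return x * math.sqrt(BASIC_NOISE_SQ + max(v - BLACK_LEVEL, 0) ** y)

print(f"{expected_rms_noise(1230):.2f}")               # pixel 1200 DN above the black level
print(f"{addable_rms_noise(1230, x=1.0, y=1.0):.2f}")  # X = Y = 1 reduces to equation (1)
```

With X = 1 and Y = 1 the second function reproduces the first, which makes it easy to see how the two parameters reshape the allowance.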

The next step in our process is to choose N of our N-bit identification word. We 
decide that a 16 bit main identification value with its 65536 possible values will be sufficiently 
large to identify the image as ours, and that we will be directly selling no more than 128 
copies of the image which we wish to track, giving 7 bits plus an eighth bit for an odd/even 
adding of the first 7 bits (i.e. an error checking bit on the first seven). The total bits required 
now are at 4 bits for the 0101 calibration sequence, 16 for the main identification, 8 for the 
version, and we now throw in another 4 as a further error checking value on the first 28 bits, 
giving 32 bits as N. The final 4 bits can use one of many industry standard error checking 
methods to choose its four values. 

We now randomly determine the 16 bit main identification number, finding for
example, 1101 0001 1001 1110; our first versions of the original sold will have all 0's as the
version identifier, and the error checking bits will fall out where they may. We now have our 
unique 32 bit identification word which we will embed on the original digital image. 
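
The layout of the 32 bit word can be sketched as follows. The text leaves the final four error checking bits to "one of many industry standard" methods, so the check used here (the sum of the seven preceding 4-bit groups, modulo 16) is purely an illustrative assumption, as are the function name and the choice of even parity for the version bits.

```python
def build_identification_word(main_id, version):
    """Assemble 4 calibration + 16 ID + 7 version + 1 parity + 4 check bits."""
    if not 0 <= main_id < 2 ** 16 or not 0 <= version < 2 ** 7:
        raise ValueError("main_id must fit in 16 bits and version in 7 bits")
    calibration = [0, 1, 0, 1]
    id_bits = [(main_id >> shift) & 1 for shift in range(15, -1, -1)]
    version_bits = [(version >> shift) & 1 for shift in range(6, -1, -1)]
    parity = [sum(version_bits) % 2]               # assumed even-parity bit over the version
    first_28 = calibration + id_bits + version_bits + parity
    # Assumed check value: sum of the seven 4-bit groups, modulo 16 (illustrative only).
    groups = [int("".join(map(str, first_28[i:i + 4])), 2) for i in range(0, 28, 4)]
    check = sum(groups) % 16
    check_bits = [(check >> shift) & 1 for shift in range(3, -1, -1)]
    return first_28 + check_bits

word = build_identification_word(0b1101000110011110, version=0)
print("".join(map(str, word)))
```

Later versions of the image would change only the version argument, which in turn changes the parity and check bits.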

To do this, we generate 32 independent random 4000 by 4000 encoding images, one for
each bit of our 32 bit identification word. The manner of generating these random images is
revealing. There are numerous ways to generate these. By far the simplest is to turn up the 
gain on the same scanner that was used to scan in the original photograph, only this time 
placing a pure black image as the input, then scanning this 32 times. The only drawback to
this technique is that it does require a large amount of memory and that "fixed pattern" noise
will be part of each independent "noise image." But, the fixed pattern noise can be removed
via normal "dark frame" subtraction techniques. Assume that we set the absolute black
average value at digital number '100,' and that rather than finding a 2 DN rms noise as we
did in the normal gain setting, we now find an rms noise of 10 DN about each and every 
pixel's mean value. 

We next apply a mid-spatial-frequency bandpass filter (spatial convolution) to
each and every independent random image, essentially removing the very high and the very 
low spatial frequencies from them. We remove the very low frequencies because simple 
real-world error sources like geometrical warping, splotches on scanners, mis-registrations, 
and the like will exhibit themselves most at lower frequencies also, and so we want to 
concentrate our identification signal at higher spatial frequencies in order to avoid these types 
of corruptions. Likewise, we remove the higher frequencies because multiple generation 
copies of a given image, as well as compression-decompression transformations, tend to wipe 
out higher frequencies anyway, so there is no point in placing too much identification signal 
into these frequencies if they will be the ones most prone to being attenuated. Therefore, our 
new filtered independent noise images will be dominated by mid-spatial frequencies. On a 
practical note, since we are using 12-bit values on our scanner, we have effectively removed
the DC value, and our new rms noise will be slightly less than 10 digital numbers, it is
useful to boil this down to a 6-bit value ranging from -32 through 0 to 31 as the resultant
random image. 
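
The filtering and reduction to 6-bit values can be sketched on a single row of noise. The text does not specify the filter kernel, so this sketch assumes a simple difference of two moving averages as the bandpass; the window widths, seed, and function names are likewise assumptions, and a real image would use a two dimensional convolution.

```python
import random

def moving_average(values, width):
    """Box smoothing with the window shrunk at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def bandpass(values, narrow=3, wide=15):
    """Narrow smoothing suppresses the highest frequencies; subtracting the wide
    smoothing removes the DC value and the lowest frequencies."""
    smooth_narrow = moving_average(values, narrow)
    smooth_wide = moving_average(values, wide)
    return [a - b for a, b in zip(smooth_narrow, smooth_wide)]

def to_six_bit(values):
    """Round and clip to the signed 6-bit range -32 through 31."""
    return [max(-32, min(31, round(v))) for v in values]

rng = random.Random(11)
raw = [100.0 + rng.gauss(0.0, 10.0) for _ in range(512)]   # dark scan: mean 100 DN, rms 10 DN
code_row = to_six_bit(bandpass(raw))
print(min(code_row), max(code_row))
```

Because the filter removes the mean, the black level of 100 DN does not appear in the resulting code values.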

Next we add all of the random images together which have a '1' in their
corresponding bit value of the 32-bit identification word, accumulating the result in a 16-bit
signed integer image. This is the unattenuated and un-scaled version of the composite
embedded signal. 

Next we experiment visually with adding the composite embedded signal to the 
original digital image, through varying the X and Y parameters of equation 2. In formula, 
we visually iterate to both maximize X and to find the appropriate Y in the following: 

V_{dist;n,m} = V_{orig;n,m} + V_{comp;n,m} * X * sqrt(4 + V_{orig;n,m}^Y)

where dist refers to the candidate distributable image, i.e. we are visually iterating to find
what X and Y will give us an acceptable image; orig refers to the pixel value of the original 
image; and comp refers to the pixel value of the composite image. The n's and m's still
index rows and columns of the image and indicate that this operation is done on all 4000 by 
4000 pixels. The symbol V is the DN of a given pixel and a given image. 

As an arbitrary assumption, now, we assume that our visual experimentation has 
found that the value of X= 0.025 and Y=0.6 are acceptable values when comparing the 
original image with the candidate distributable image. This is to say, the distributable image 
with the "extra noise" is acceptably close to the original in an aesthetic sense. Note that since
our individual random images had a random rms noise value around 10 DN, and that adding 
approximately 16 of these images together will increase the composite noise to around 40 DN, 
the X multiplication value of 0.025 will bring the added rms noise back to around 1 DN, or 
half the amplitude of our innate noise on the original. This is roughly a 1 dB gain in noise at 
the dark pixel values and correspondingly more at the brighter values modified by the Y value 
of 0.6. 
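
The noise estimate in the preceding paragraph can be reproduced numerically. The sketch below mirrors the text's own simplification, treating X alone as the attenuation at dark pixel values; the function name is hypothetical, and clamping negative pixel values to zero before taking the power is an added assumption.

```python
import math

def distributable_pixel(v_orig, v_comp, x=0.025, y=0.6):
    """Original pixel plus the composite value scaled by X*sqrt(4 + V_orig^Y)."""
    return v_orig + v_comp * x * math.sqrt(4.0 + max(v_orig, 0.0) ** y)

individual_rms = 10.0                                       # rms of one random image, in DN
images_summed = 16                                          # about half of the 32 bits are '1'
composite_rms = individual_rms * math.sqrt(images_summed)   # independent noise adds in quadrature
added_rms = 0.025 * composite_rms                           # the text's estimate of the added noise
innate_rms = 2.0                                            # basic noise of the original scan
gain_db = 20.0 * math.log10(math.hypot(innate_rms, added_rms) / innate_rms)
print(f"composite {composite_rms:.0f} DN, added {added_rms:.1f} DN, gain {gain_db:.2f} dB")
```

Adding about 1 DN in quadrature to an innate 2 DN gives a rise of a little under 1 dB, consistent with the figure quoted above.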

So with these two values of X and Y, we now have constructed our first versions of 
a distributable copy of the original. Other versions will merely create a new composite signal 
and possibly change the X slightly if deemed necessary. We now lock up the original digital 
image along with the 32-bit identification word for each version, and the 32 independent 
random 4-bit images, waiting for our first case of a suspected piracy of our original. Storage 
wise, this is about 14 Megabytes for the original image and 32*0.5bytes*16 million = ~256
Megabytes for the random individual encoded images. This is quite acceptable for a single 
valuable image. Some storage economy can be gained by simple lossless compression. 

Finding a Suspected Piracy of our Image

We sell our image and several months later find our two heads of state in the exact 
poses we sold them in, seemingly cut and lifted out of our image and placed into another 
stylized background scene. This new "suspect" image is being printed in 100,000 copies of a 
given magazine issue, let us say. We now go about determining if a portion of our original 
image has indeed been used in what is clearly an unauthorized manner. Fig. 3 summarizes 
the details.

The first step is to take an issue of the magazine, cut out the page with the image on
it, then carefully but not too carefully cut out the two figures from the background image 
using ordinary scissors. If possible, we will cut out only one connected piece rather than the 
two figures separately. We paste this onto a black background and scan this into a digital 
form. Next we electronically flag or mask out the black background, which is easy to do by 
visual inspection. 

We now procure the original digital image from our secured place along with the 
32-bit identification word and the 32 individual embedded images. We place the original 
digital image onto our computer screen using standard image manipulation software, and we 
roughly cut along the same borders as our masked area of the suspect image, masking this 
image at the same time in roughly the same manner. The word 'roughly' is used since an 
exact cutting is not needed; it merely aids the identification statistics to get it reasonably close.

Next we rescale the masked suspect image to roughly match the size of our masked 
original digital image, that is, we digitally scale up or down the suspect image and roughly 
overlay it on the original image. Once we have performed this rough registration, we then 
throw the two images into an automated scaling and registration program. The program 
performs a search on the three parameters of x position, y position, and spatial scale, with the 
figure of merit being the mean squared error between the two images for any given scale
variable and x and y offset. This is a fairly standard image processing methodology. 
Typically this would be done using generally smooth interpolation techniques and done to 
sub-pixel accuracy. The search method can be one of many, where the simplex method is a 
typical one. 
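
A reduced version of the registration search is sketched below. It is deliberately simplified and rests on several assumptions: only whole-pixel x and y offsets are searched, the scale parameter and sub-pixel interpolation are omitted, and an exhaustive search stands in for the simplex method. Images are plain lists of rows, and the names are hypothetical.

```python
import random

def mse_at_offset(reference, suspect, dx, dy):
    """Mean squared error over the overlap when suspect is placed at (dx, dy)."""
    total, count = 0.0, 0
    for y, row in enumerate(suspect):
        for x, value in enumerate(row):
            ry, rx = y + dy, x + dx
            if 0 <= ry < len(reference) and 0 <= rx < len(reference[0]):
                total += (reference[ry][rx] - value) ** 2
                count += 1
    return total / count if count else float("inf")

def register(reference, suspect, max_shift=4):
    """Return the (dx, dy) offset with the smallest mean squared error."""
    candidates = [(dx, dy)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: mse_at_offset(reference, suspect, s[0], s[1]))

rng = random.Random(5)
reference = [[rng.random() for _ in range(24)] for _ in range(24)]
suspect = [row[2:18] for row in reference[3:19]]    # a crop taken at x=2, y=3
shift = register(reference, suspect)
print(f"best offset (dx, dy) = {shift}")
```

Extending the candidate list with a scale factor, and resampling the suspect image for each candidate, would give the three-parameter search the text describes.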

Once the optimal scaling and x-y position variables are found, next comes another 
search on optimizing the black level, brightness gain, and gamma of the two images. Again,
the figure of merit to be used is mean squared error, and again the simplex or other search 
methodologies can be used to optimize the three variables. After these three variables are 
optimized, we apply their corrections to the suspect image and align it to exactly the pixel 
spacing and masking of the original digital image and its mask. We can now call this the 
standard mask. 
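
As a rough illustration of this normalization step, the sketch below fits only the brightness gain and black level, using a closed-form linear least-squares fit (which minimizes the same mean squared error) in place of a simplex search. Gamma is deliberately left out, and the function names are assumptions for the example rather than part of the disclosure.

```python
import numpy as np

def fit_gain_and_black_level(original, suspect, mask):
    """Least-squares gain and offset so that gain * suspect + offset ~ original.

    Only the pixels inside the mask contribute to the fit.
    """
    x = suspect[mask].astype(float)
    y = original[mask].astype(float)
    design = np.column_stack([x, np.ones_like(x)])
    (gain, offset), *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(gain), float(offset)

def normalize(suspect, gain, offset):
    """Apply the fitted corrections to the suspect image."""
    return gain * suspect + offset
```

Adding gamma makes the fit non-linear, at which point an iterative search such as the simplex method becomes the natural choice.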

The next step is to subtract the original digital image from the newly normalized 
suspect image only within the standard mask region. This new image is called the difference 
image. 

Then we step through all 32 individual random embedded images, doing a local 
cross-correlation between the masked difference image and the masked individual embedded 
image. 'Local* refers to the idea that one need only start correlating over an offset region of 
+/- 1 pixels of offset between the nominal registration points of the two images found during 
the search procedures above. The peak correlation should be very close to the nominal 
registration point of 0,0 offset, and we can add the 3 by 3 correlation values together to give 
one grand correlation value for each of the 32 individual bits of our 32-bit identification 
word. 
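
The local cross-correlation can be sketched as follows, assuming the difference image and the embedded images are NumPy arrays of equal shape and the standard mask is a boolean array. The function names are hypothetical, and `np.roll` is used for the one-pixel offsets as a simplification that wraps around the image border.

```python
import numpy as np

def grand_correlation(difference, code, mask):
    """Sum of the 3-by-3 correlation values for offsets of -1, 0, +1 pixels."""
    masked_diff = np.where(mask, difference, 0.0)
    masked_code = np.where(mask, code, 0.0)
    total = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            shifted = np.roll(masked_code, (dy, dx), axis=(0, 1))
            total += float(np.sum(masked_diff * shifted))
    return total

def correlate_all_bits(difference, codes, mask):
    """One grand correlation value per individual embedded image."""
    return [grand_correlation(difference, code, mask) for code in codes]
```

For the example in the text, `codes` would hold the 32 individual embedded images and the returned list would hold the 32 quasi-floating point values.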

After doing this for all 32 bit places and their corresponding random images, we 
have a quasi-floating point sequence of 32 values. The first four values represent our 
calibration signal of 0101. We now take the mean of the first and third floating point value 
and call this floating point value '0,' and we take the mean of the second and the fourth value 
and call this floating point value '1.' We then step through all remaining 28 bit values and 
assign either a '0' or a '1' based simply on which mean value they are closer to. Stated 
simply, if the suspect image is indeed a copy of our original, the embedded 32-bit resulting 
code should match that of our records, and if it is not a copy, we should get general 
randomness. Two further possibilities remain: 3) the image is a copy but does not match the 
identification number, and 4) the image is not a copy but does match. Case 3) is possible if the 
signal to noise ratio of the process has plummeted, i.e. the 'suspect image' is truly a very 
poor copy of the original. Case 4) is basically one chance in four billion since we 
were using a 32-bit identification number. If we are truly worried about 4), we can just have 
a second independent lab perform their own tests on a different issue of the same magazine. 
Finally, checking the error-check bits against what the values give is one final and possibly 
overkill check on the whole process. In situations where signal to noise is a possible 
problem, these error checking bits might be eliminated without too much harm. 
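
The thresholding against the 0101 calibration signal can be written out as a short sketch. The function name is an assumption for the example; the input is the sequence of grand correlation values, calibration values first.

```python
def decode_identification_word(values):
    """Convert correlation values into bits using the 0101 calibration prefix.

    The first four values are the calibration signal 0, 1, 0, 1. Each
    remaining value is assigned the bit whose calibration mean is closer.
    """
    if len(values) < 5:
        raise ValueError("need four calibration values plus at least one data value")
    level_zero = (values[0] + values[2]) / 2.0   # mean of first and third
    level_one = (values[1] + values[3]) / 2.0    # mean of second and fourth
    bits = []
    for value in values[4:]:
        closer_to_one = abs(value - level_one) < abs(value - level_zero)
        bits.append(1 if closer_to_one else 0)
    return bits
```

With 32 input values the function returns the remaining 28 bits, which can then be compared with the identification word held on record.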

Assuming that a positive identification is made, we must now decide what to do about it. 

Benefits of the Eye-D Method 

Now that a full description of the preferred embodiment has been given via a 
detailed example, it is appropriate to point out the rationale of some of the process steps and 
their benefits. 

The ultimate benefit of the process is that obtaining an identification number is 
fully independent of the manners and methods of preparing the difference image. That is to 
say, the manners of preparing the difference image, such as cutting, registering, scaling, 
etcetera, cannot increase the odds of finding an identification number when none exists; it 
only helps the signal-to-noise ratio of the identification process when a true identification 
number is present. Methods of preparing images for identification can even differ from one 
another, providing the possibility for multiple independent methodologies for making a 
match. 

The ability to obtain a match even on sub-sets of the original signal or image is a key 
point in today's information-rich world. Cutting and pasting both images and sound clips is 
becoming more common, thus Eye-D provides a method whereby identification can still be 
performed even when original material has been thus corrupted. Finally, the signal to noise 
ratio of matching should begin to become difficult only when the copy material itself has been 
significantly altered either by noise or by significant distortion; both of these also will affect 
that copy's commercial value, so that trying to thwart the system can only be done at the 
expense of a huge decrease in commercial value. 

The fullest expression of the Eye-D system will come when it becomes an industry 
standard and numerous independent groups set up their own means or 'in-house' brand 
of applying embedded identification numbers and of deciphering them. Identification by 
numerous independent groups will further enhance the ultimate objectivity of the method, 
thereby enhancing its appeal as an industry standard. 

Use of True Polarity in Creating the Composite Embedded Code Signal 

The foregoing disclosure made use of the 0 and 1 formalism of binary technology to 
accomplish its ends. Specifically, the 0's and 1's of the N-bit identification word directly 
multiplied their corresponding individual embedded code signal to form the composite 
embedded code signal (step 8, figure 2). This approach certainly has its conceptual 
simplicity, but the multiplication of an embedded code signal by 0 along with the storage of 
that embedded code contains a kind of inefficiency. 

It is preferred to maintain the formalism of the 0 and 1 nature of the N-bit 
identification word, but to have the 0's of the word induce a subtraction of their 
corresponding embedded code signal. Thus, in step 8 of figure 2, rather than only 'adding' 
the individual embedded code signals which correspond to a '1' in the N-bit identification 
word, we will also 'subtract' the individual embedded code signals which correspond to a '0' 
in the N-bit identification word. 

At first glance this seems to add more apparent noise to the final composite signal. 
But it also increases the energy-wise separation of the 0's from the 1's, and thus the 'gain' 
which is applied in step 10, figure 2 can be correspondingly lower. 

We can refer to this improvement as the use of true polarity. The main advantage of 
this improvement can largely be summarized as 'informational efficiency.' 
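
The difference between the two formalisms can be shown with a small sketch. It assumes the individual embedded code signals are stacked in a NumPy array of shape (N, height, width); the function name and the `true_polarity` flag are hypothetical names introduced for the example.

```python
import numpy as np

def composite_signal(bits, codes, true_polarity=True):
    """Form the composite embedded code signal from an N-bit word.

    Without true polarity, each code signal is multiplied by its bit (0 or 1).
    With true polarity, a 1 adds the code signal and a 0 subtracts it.
    """
    bits = np.asarray(bits, dtype=float)
    weights = 2.0 * bits - 1.0 if true_polarity else bits
    # Weighted sum over the N code signals, giving one (height, width) image.
    return np.tensordot(weights, codes, axes=1)
```

When the composite is correlated against each individual code signal, the true-polarity version places the 0 bits at roughly minus the level of the 1 bits rather than near zero, which is the increased separation that allows a lower gain.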

'Perceptual Orthogonality' of the Individual Embedded Code Signals 

The foregoing disclosure contemplates the use of generally random noise-like signals 
as the individual embedded code signals. This is perhaps the simplest form of signal to 
generate. However, there is a form of informational optimization which can be applied to the 
set of the individual embedded signals which the applicant describes under the rubric 
'perceptual orthogonality.' This term is loosely based on the mathematical concept of the 
orthogonality of vectors, with the current additional requirement that this orthogonality should 
maximize the signal energy of the identification information while maintaining it below some 
perceptibility threshold. Put another way, the embedded code signals need not necessarily be 
random in nature. 

Use and Improvements of the Invention in the Field of Emulsion-Based Photography 

The foregoing portions of this disclosure outlined techniques that are applicable to 
photographic materials. The following section explores the details of this area further and 
discloses certain improvements which lend themselves to a broad range of applications. 

The first area to be discussed involves the pre-application or pre-exposing of a serial 
number onto traditional photographic products, such as negative film, print paper, 
transparencies, etc. In general, this is a way to embed a priori unique serial numbers (and by 
implication, ownership and tracking information) into photographic material. The serial 
numbers themselves would be a permanent part of the normally exposed picture, as opposed 
to being relegated to the margins or stamped on the back of a printed photograph, which all 
require separate locations and separate methods of copying. The 'serial number' as it is 
called here is generally synonymous with the N-bit identification word, only now we are 
using a more common industrial terminology. 

In Figure 2, step 11, the disclosure calls for the storage of the "original [image] " 
along with code images. Then in figure 3, step 9, it directs that the original be subtracted 
from the suspect image, thereby leaving the possible identification codes plus whatever noise 
and corruption has accumulated. Therefore, the previous disclosure made the tacit assumption 
that there exists an original without the composite embedded signals. 

Now in the case of selling print paper and other duplication film products, this will 
still be the case, i.e., an "original" without the embedded codes will indeed exist and the basic 
methodology of the invention can be employed. The original film serves perfectly well as an 
'unencoded original.' 

However, in the case where pre-exposed negative film is used, the composite 
embedded signal pre-exists on the original film and thus there will never be an "original" 
separate from the pre-embedded signal. It is this latter case, therefore, which will be 
examined a bit more closely along with various remedies on how to use the basic principles of 
the invention (the former cases adhering to the previously outlined methods). 

The clearest point of departure for the case of pre-numbered negative film, i.e. 
negative film which has had each and every frame pre-exposed with a very faint and unique 
composite embedded signal, comes at step 9 of figure 3 as previously noted. There are 
certainly other differences as well, but they are mostly logistical in nature such as how and 
when to embed the signals on the film, how to store the code numbers and serial number, etc. 
Obviously the pre-exposing of film would involve a major change to the general mass 
production process of creating and packaging film. 

Fig. 4 has a schematic outlining one potential post-hoc mechanism for pre-exposing 
film. 'Post-hoc' refers to applying a process after the full common manufacturing process of 
film has already taken place. Eventually, economies of scale may dictate placing this 
pre-exposing process directly into the chain of manufacturing film. Depicted in Fig. 4 is what is 
commonly known as a film writing system. The computer, 106, displays the composite signal 
produced in step 8, figure 2, on its phosphor screen. A given frame of film is then exposed 
by imaging this phosphor screen, where the exposure level is generally very faint, i.e. 
generally imperceptible. Clearly, the marketplace will set its own demands on how faint this 
should be, that is, the level of added 'graininess' as practitioners would put it. Each frame of 
film is sequentially exposed, where in general the composite image displayed on the CRT 102 
is changed for each and every frame, thereby giving each frame of film a different serial 
number. The transfer lens 104 highlights the focal conjugate planes of a film frame and the 
CRT face. 

Getting back to applying the principles of the invention in the case of pre-exposed 
negative film: at step 9, figure 3, if we were to subtract the "original" with its embedded 
code, we would obviously be "erasing" the code as well since the code is an integral part of 
the original. Fortunately, remedies do exist and identifications can still be made. However, 
it will be a challenge to artisans who refine this invention to have the signal to noise ratio of 
the identification process in the pre-exposed negative case approach the signal to noise ratio of 
the case where the un-encoded original exists. 

A succinct definition of the problem is in order at this point. Given a suspect picture 
(signal), find the embedded identification code IF a code exists at all. The problem reduces to 
one of finding the amplitude of each and every individual embedded code signal within the 
suspect picture, not only within the context of noise and corruption as was previously 
explained, but now also within the context of the coupling between a captured image and the 
codes. 'Coupling' here refers to the idea that the captured image "randomly biases" the 
cross-correlation. 

So, bearing in mind this additional item of signal coupling, the identification process 
now estimates the signal amplitude of each and every individual embedded code signal (as 
opposed to taking the cross-correlation result of step 12, figure 3). If our identification signal 
exists in the suspect picture, the amplitudes thus found will split into a polarity with positive 
amplitudes being assigned a '1' and negative amplitudes being assigned a '0'. Our unique 
identification code manifests itself. If, on the other hand, no such identification code exists or 
it is someone else's code, then a random Gaussian-like distribution of amplitudes is found with 
a random hash of values. 

It remains to provide a few more details on how the amplitudes of the individual 
embedded codes are found. Again, fortunately, this exact problem has been treated in other 
technological applications. Besides, throw this problem and a little food into a crowded room 
of mathematicians and statisticians and surely a half dozen optimized methodologies will pop 
out after some reasonable period of time. It is a rather cleanly defined problem. 

One specific example solution which is also the current preferred embodiment comes 
from the field of astronomical imaging. Here, it is a mature prior art to subtract out a 
"thermal noise frame" from a given CCD image of an object. Often, however, it is not 
precisely known what scaling factor to use in subtracting the thermal frame and a search for 
the correct scaling factor is performed. This is precisely the task of this step of the present 
invention. 

General practice merely performs a common search algorithm on the scaling factor, 
where a scaling factor is chosen and a new image is created according to: 

NEW IMAGE = ACQUIRED IMAGE - SCALE * THERMAL IMAGE 

The new image is applied to the fast Fourier transform routine and a scale factor is 
eventually found which minimizes the integrated high frequency content of the new image. 
This general type of search operation with its minimization of a particular quantity is 
exceedingly common. The scale factor thus found is the "amplitude" being sought within the 
steps of the present invention. Refinements which are contemplated but not yet implemented 
are where the coupling of the higher derivatives of the acquired image and the embedded 
codes are estimated and removed from the calculated scale factor. In other words, certain 
bias effects from the coupling mentioned earlier are present and should be eventually 
accounted for and removed both through theoretical and empirical experimentation. 
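
A minimal sketch of the scale-factor search follows, with an individual embedded code signal playing the role of the thermal image. A simple search over a list of candidate scale factors stands in for whatever search algorithm is used in practice, and the frequency cutoff of 0.25 cycles per pixel is an arbitrary assumption chosen for the example, as are the function names.

```python
import numpy as np

def high_frequency_energy(image, cutoff=0.25):
    """Integrated spectral power above `cutoff` cycles per pixel."""
    spectrum = np.fft.fft2(image)
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    high = np.hypot(fy, fx) > cutoff   # assumed definition of "high frequency"
    return float(np.sum(np.abs(spectrum[high]) ** 2))

def estimate_amplitude(acquired, code, candidates):
    """Scale factor minimizing the high-frequency content of the new image.

    NEW IMAGE = ACQUIRED IMAGE - SCALE * CODE IMAGE for each candidate scale.
    """
    energies = [high_frequency_energy(acquired - scale * code)
                for scale in candidates]
    return float(candidates[int(np.argmin(energies))])
```

Running `estimate_amplitude` once per individual embedded code signal yields the list of amplitudes; a positive amplitude is then read as a '1' and a negative amplitude as a '0'. The sketch makes no attempt to remove the coupling bias discussed above.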

Use and Improvements of the Invention in the Detection of Signal or Image Alteration 

Apart from the basic need of identifying a signal or image as a whole, there is also a 
rather ubiquitous need to detect possible alterations to a signal or image. The following 
section describes how the present invention, with certain modifications and improvements, can 
be used as a powerful tool in this area. The potential scenarios and applications of detecting 
alterations are innumerable. 

To first summarize, assume that we have a given signal or image which has been 
positively identified using the basic methods outlined in the foregoing disclosure. In other 
words, we know its N-bit identification word, its individual embedded code signals, and its 
composite embedded code. We can then fairly simply create a spatial map of the composite 
code's amplitude within our given signal or image. Furthermore, we can divide this 
amplitude map by the known composite code's spatial amplitude, giving a normalized map, 
i.e. a map which should fluctuate about some global mean value. By simple examination of 
this map, we can visually detect any areas which have been significantly altered wherein the 
value of the normalized amplitude dips below some statistically set threshold based purely on 
typical noise and corruption (error). 

The details of implementing the creation of the amplitude map have a variety of 
choices. The preferred embodiment at this time is to perform the same procedure which is 
used to determine the signal amplitude as described above, only now we step and repeat the 
multiplication of any given area of the signal/image with a Gaussian weight function centered 
about the area we are investigating. 
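
One way to picture the step-and-repeat procedure is the sketch below. It is an assumption-laden simplification: instead of the scale-factor search described earlier, the local amplitude is estimated by a Gaussian-weighted least-squares projection of a difference image onto the known composite code, which also performs the normalization in the same step. The function names, the grid `step`, and the Gaussian width `sigma` are all hypothetical.

```python
import numpy as np

def local_amplitude(difference, composite, center, sigma):
    """Gaussian-weighted amplitude of the composite code near `center`."""
    rows, cols = np.indices(difference.shape)
    dist_sq = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    weight = np.exp(-dist_sq / (2.0 * sigma ** 2))
    numerator = np.sum(weight * difference * composite)
    denominator = np.sum(weight * composite ** 2)
    return float(numerator / denominator)

def amplitude_map(difference, composite, step=8, sigma=6.0):
    """Normalized amplitude sampled on a regular grid of window centers."""
    row_centers = range(step // 2, difference.shape[0], step)
    col_centers = range(step // 2, difference.shape[1], step)
    return np.array([[local_amplitude(difference, composite, (r, c), sigma)
                      for c in col_centers] for r in row_centers])
```

Values near 1 indicate that the composite code is intact in that neighborhood, while values falling below a chosen threshold flag a region as possibly altered.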

Universal Versus Custom Codes 

The disclosure thus far has outlined how each and every source signal has its own 
unique set of individual embedded code signals. This clearly entails the storage of a 
significant amount of additional code information above and beyond the original, and many 
applications may merit some form of economizing. 

One such approach to economizing is to have a given set of individual embedded 
code signals be common to a batch of source materials. For example, one thousand images 
can all utilize the same basic set of individual embedded code signals. The storage 
requirements of these codes then become a small fraction of the overall storage requirements 
of the source material. 

Furthermore, some applications can utilize a universal set of individual embedded 
code signals, i.e., codes which remain the same for all instances of distributed material. This 
type of requirement would be seen by systems which wish to hide the N-bit identification 
word itself, yet have standardized equipment be able to read that word. This can be used in 
systems which make go/no go decisions at point-of-read locations. The potential drawback to 
this set-up is that the universal codes are more prone to be sleuthed or stolen; therefore they 
will not be as secure as the apparatus and methodology of the previously disclosed 
arrangement. Perhaps this is just the difference between 'high security' and 'air-tight 
security,' a distinction carrying little weight with the bulk of potential applications. 

Use of the Invention in Printing, Paper, Documents, Plastic Coated Identification Cards, and 
Other Material Where Global Embedded Codes Can Be Imprinted 

The term 'signal' in the title of the disclosure is often used narrowly to refer to 
digital data information, audio signals, images, etc. A broader interpretation of 'signal,' and 
the one more generally intended, includes any form of modulation of any material 
whatsoever. Thus, the micro-topology of a piece of common paper becomes a 'signal' (e.g. its 
height as a function of x-y coordinates). The reflective properties of a flat piece of plastic (as 
a function of space also) become a signal. The point is that photographic emulsions, audio 
signals, and digitized information are not the only types of signals capable of utilizing the 
principles of the invention. 

As a case in point, a machine very much resembling a braille printing machine can 
be designed so as to imprint unique 'noise-like' indentations as outlined in the disclosure. 
These indentations can be applied with a pressure which is much smaller than is typically 
applied in creating braille, to the point where the patterns are not noticed by a normal user of 
the paper. But by following the steps of the present disclosure and applying them via the 
mechanism of micro-indentations, a unique identification code can be placed onto any given 
sheet of paper, be it intended for everyday stationery purposes, or be it for important 
documents, legal tender, or other secured material. 

The reading of the identification material in such an embodiment generally proceeds 
by merely reading the document optically at a variety of angles. This would become an 
inexpensive method for deducing the micro-topology of the paper surface. Certainly other 
forms of reading the topology of the paper are possible as well. 

In the case of plastic encased material such as identification cards, e.g. driver's 
licenses, a similar braille-like impression machine can be utilized to imprint unique 
identification codes. Subtle layers of photoreactive materials can also be embedded inside the 
plastic and 'exposed.' 

40333.pa 3/17/94 

It is clear that wherever a material exists which is capable of being modulated by 
'noise-like' signals, that material is an appropriate carrier for unique identification codes and 
utilization of the principles of the invention. The trick becomes one of economically applying 
the identification information and maintaining the signal level below an acceptability threshold 
which each and every application will define for itself. 

Appendix A Description 

Appendix A contains the source code of an implementation and verification of the 
Eye-D system on an 8-bit black and white imaging system. 

Conclusion 

Having described and illustrated the principles of my invention with reference to an 
illustrative embodiment and several variations thereof, it should be apparent that the invention 
can be modified in arrangement and detail without departing from such principles. 
Accordingly, I claim as my invention all such embodiments as come within the scope and 
spirit of the following claims and equivalents thereto. 